Search CORE

25 research outputs found

Distributed k-core view materialization and maintenance for large dynamic graphs

Author: Aksu H.
Canim M.
Chang Y. C.
Korpeoglu I.
Ulusoy O.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/10/2014
Field of study

Cataloged from PDF version of article.In graph theory, k-core is a key metric used to identify subgraphs of high cohesion, also known as the ‘dense’ regions of a graph. As the real world graphs such as social network graphs grow in size, the contents get richer and the topologies change dynamically, we are challenged not only to materialize k-core subgraphs for one time but also to maintain them in order to keep up with continuous updates. Adding to the challenge is that real world data sets are outgrowing the capacity of a single server and its main memory. These challenges inspired us to propose a new set of distributed algorithms for k-core view construction and maintenance on a horizontally scaling storage and computing platform. Our algorithms execute against the partitioned graph data in parallel and take advantage of k-core properties to aggressively prune unnecessary computation. Experimental evaluation results demonstrated orders of magnitude speedup and advantages of maintaining k-core incrementally and in batch windows over complete reconstruction. Our algorithms thus enable practitioners to create and maintain many k-core views on different topics in rich social network content simultaneously

Bilkent University Institutional Repository

HetFS: A heterogeneous file system for everyone

Author: A Aghayev
A-IA Wang
I Koltsidas
J No
J Yang
M Canim
S Pelley
X Liu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017
Field of study

Storage devices have been getting more and more diverse during the last decade. The advent of SSDs made it painfully clear that rotating devices, such as HDDs or magnetic tapes, were lacking in regards to response time. However, SSDs currently have a limited number of write cycles and a significantly larger price per capacity, which has prevented rotational technologies from begin abandoned. Additionally, Non-Volatile Memories (NVMs) have been lately gaining traction, offering devices that typically outperform NAND-based SSDs but exhibit a full new set of idiosyncrasies. Therefore, in order to appropriately support this diversity, intelligent mechanisms will be needed in the near-future to balance the benefits and drawbacks of each storage technology available to a system. In this paper, we present a first step towards such a mechanism called HetFS, an extension to the ZFS file system that is capable of choosing the storage device a file should be kept in according to preprogrammed filters. We introduce the prototype and show some preliminary results of the effects obtained when placing specific files into different devices.The research leading to these results has received funding from the European Community under the BIGStorage ETN (Project 642963 of the H2020-MSCA-ITN-2014), by the Spanish Ministry of Economy and Competitiveness under the TIN2015-65316 grant and by the Catalan Government under the 2014-SGR- 1051 grant. To learn more about the BigStorage project, please visit http: //bigstorage-project.eu/.Peer ReviewedPostprint (author's final draft

Crossref

UPCommons. Portal del coneixement obert de la UPC

Flash-based extended cache for higher throughput and faster recovery

Author: Canim M.
Gray J.
Koltsidas I.
Zhou Y.
Publication venue: 'VLDB Endowment'
Publication date
Field of study

Crossref

Graph aware caching policy for distributed graph stores

Author: Aksu H.
Canim M.
Chang Y.-C.
Korpeoglu I.
Ulusoy Ö.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015
Field of study

Graph stores are becoming increasingly popular among NOSQL applications seeking flexibility and heterogeneity in managing linked data. Conceptually and in practice, applications ranging from social networks, knowledge representations to Internet of things benefit from graph data stores built on a combination of relational and non-relational technologies aimed at desired performance characteristics. The most common data access pattern in querying graph stores is to traverse from a node to its neighboring nodes. This paper studies the impact of such traversal pattern to common data caching policies in a partitioned data environment where a big graph is distributed across servers in a cluster. We propose and evaluate a new graph aware caching policy designed to keep and evict nodes, edges and their metadata optimized for query traversal pattern. The algorithm distinguishes the topology of the graph as well as the latency of access to the graph nodes and neighbors. We implemented graph aware caching on a distributed data store Apache HBase in the Hadoop family. Performance evaluations showed up to 15x speedup on the benchmark datasets preferring our new graph aware policy over non-aware policies. We also show how to improve the performance of existing caching algorithms for distributed graphs by exploiting the topology information. © 2015 IEEE

Bilkent University Institutional Repository

Multi-resolution social network community identification and maintenance on big data platform

Author: Aksu H.
Canim M.
Chang Y.-C.
Korpeoglu I.
Ulusoy O.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013
Field of study

Community identification in social networks is of great interest and with dynamic changes to its graph representation and content, the incremental maintenance of community poses significant challenges in computation. Moreover, the intensity of community engagement can be distinguished at multiple levels, resulting in a multi-resolution community representation that has to be maintained over time. In this paper, we first formalize this problem using the k-core metric projected at multiple k values, so that multiple community resolutions are represented with multiple k-core graphs. We then present distributed algorithms to construct and maintain a multi-k-core graph, implemented on the scalable big-data platform Apache HBase. Our experimental evaluation results demonstrate orders of magnitude speedup by maintaining multi-k-core incrementally over complete reconstruction. Our algorithms thus enable practitioners to create and maintain communities at multiple resolutions on different topics in rich social network content simultaneously. © 2013 IEEE

Crossref

Bilkent University Institutional Repository

Efficient community identification and maintenance at multiple resolutions on distributed datastores

Author: Aksu H.
Canim M.
Chang Y.-C.
Korpeoglu I.
Ulusoy Ö.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015
Field of study

The topic of network community identification at multiple resolutions is of great interest in practice to learn high cohesive subnetworks about different subjects in a network. For instance, one might examine the interconnections among web pages, blogs and social content to identify pockets of influencers on subjects like 'Big Data', 'smart phone' or 'global warming'. With dynamic changes to its graph representation and content, the incremental maintenance of a community poses significant challenges in computation. Moreover, the intensity of community engagement can be distinguished at multiple levels, resulting in a multi-resolution community representation that has to be maintained over time. In this paper, we first formalize this problem using the k-core metric projected at multiple k-values, so that multiple community resolutions are represented with multiple k-core graphs. Recognizing that large graphs and their even larger attributed content cannot be stored and managed by a single server, we then propose distributed algorithms to construct and maintain a multi-k-core graph, implemented on the scalable Big Data platform Apache HBase. Our experimental evaluation results demonstrate orders of magnitude speedup by maintaining multi-k-core incrementally over complete reconstruction. Our algorithms thus enable practitioners to create and maintain communities at multiple resolutions on multiple subjects in rich network content simultaneously. © 2015 Elsevier B.V. All rights reserved

Bilkent University Institutional Repository

Glucose-6-Phosphate Dehydrogenase Deficiency and Malaria: A Method to Detect Primaquine-Induced Hemolysis in vitro

Author: Abamor Emrah Sefik
Allahverdiyev Adil M.
Ates Sezen Canim
Bagirova Malahat
Baydar Serap Yesilkir
Elcicek Serhat
Koc Rabia Cakir
Oztel Olga Nehir
Yaman Serkan
Publication venue: 'IntechOpen'
Publication date: 14/11/2012
Field of study

IntechOpen

Aldehyde Dehydrogenase: Cancer and Stem Cells

Author: Abamor Emrah Sefik
Allahverdiyev Adil M.
Ates Sezen Canim
Bagirova Malahat
Baydar Serap Yesilkir
Elcicek Serhat
Koc Rabia Cakir
Oztel Olga Nehir
Yaman Serkan
Publication venue: 'IntechOpen'
Publication date: 14/11/2012
Field of study

IntechOpen

Glucose-6-Phosphate Dehydrogenase Deficiency and Malaria: A Method to Detect Primaquine-Induced Hemolysis in vitro

Author: Adil M Allahverdiyev
Emrah Sefik Abamor Serkan Yaman
Malahat Bagirova
Olga Nehir Oztel
Rabia Cakir Koc
Serap Yesilkir Baydar
Serhat Elcicek
Sezen Canim Ates
Publication venue
Publication date: 03/04/2020
Field of study

CiteSeerX

Routes for breaching and protecting genetic privacy

Author: A Acquisti
A Cavoukian
A Kong
A Machanavajjhala
A Narayanan
AD Johnson
AJ Pakstis
AK Manning
AL McGuire
Arvind Narayanan
B Fons
B Malin
B Malin
BA Malin
BM Henn
C Dwork
C Shannon
CD Huff
D Clayton
D He
D Zubakov
DJ Solve
DR Nyholt
DW Craig
EA Zerhouni
EE Schadt
EM Ramos
F Liu
G Church
H Lango Allen
H Li
HK Im
HS Venter
J Burn
J Gitschier
J Kaiser
J Kaye
J Kaye
J Lee
J Marchini
JE Lunshof
JH Park
JM Oliver
JP Roberts
K Benitez
K El Emam
K El Emam
K Silventoinen
KA Tryka
KB Jacobs
KS Kendler
L Kamm
L Sweeney
L Sweeney
LA Sweeney
LA Sweeney
LAP Kohn
LL Rodriguez
M Canim
M Gymrek
M Gymrek
M Kantarcioglu
M Kayser
MD Mailman
N Chatterjee
N Homer
NN Taleb
P Bohannon
P Kwok
P Ohm
P Paillier
PM Visscher
R Braun
R Drmanac
R Khan
R Noumeir
RL Bennett
S Byers
S McClure
S Sankararaman
S Walsh
SE Brenner
SF Terry
SH Friend
T Lumley
TE King
TE King
V Bafna
W Fu
W Hartzog
WG Hill
WW Lowrance
XL Ou
Yaniv Erlich
Z Lin
Publication venue
Publication date: 01/12/2013
Field of study

We are entering the era of ubiquitous genetic information for research, clinical care, and personal curiosity. Sharing these datasets is vital for rapid progress in understanding the genetic basis of human diseases. However, one growing concern is the ability to protect the genetic privacy of the data originators. Here, we technically map threats to genetic privacy and discuss potential mitigation strategies for privacy-preserving dissemination of genetic data.Comment: Draft for comment

arXiv.org e-Print Archive

Princeton University Open Access Repository

Crossref

PubMed Central